<!DOCTYPE html>
<html lang="en">
<head>
	<meta charset="UTF-8">
	<meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
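	<title>Create datafeeds API | Elasticsearch 7.7 Definitive Guide (Chinese Edition)</title>
	<meta name="keywords" content="Elasticsearch Definitive Guide (Chinese Edition), elasticsearch 7, es7, real-time data analysis, real-time data search" />
    <meta name="description" content="Elasticsearch Definitive Guide (Chinese Edition), elasticsearch 7, es7, real-time data analysis, real-time data search" />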
    <!-- Give IE8 a fighting chance -->
    <!--[if lt IE 9]>
    <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
    <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
    <![endif]-->
	<link rel="stylesheet" type="text/css" href="../static/styles.css" />
	<script>
	var _link = 'ml-put-datafeed.html';
    </script>
</head>
<body>
<div class="main-container">
    <section id="content">
        <div class="content-wrapper">
            <section id="guide" lang="zh_cn">
                <div class="container">
                    <div class="row">
                        <div class="col-xs-12 col-sm-8 col-md-8 guide-section">
                            <div style="color:gray; word-break: break-all; font-size:12px;">Original English version: <a href="https://www.elastic.co/guide/en/elasticsearch/reference/7.7/ml-put-datafeed.html" rel="nofollow" target="_blank">https://www.elastic.co/guide/en/elasticsearch/reference/7.7/ml-put-datafeed.html</a>. Copyright of the original documentation belongs to www.elastic.co.<br/>Local English version: <a href="../en/ml-put-datafeed.html" rel="nofollow" target="_blank">../en/ml-put-datafeed.html</a></div>
                        <!-- start body -->
                  <div class="page_header">
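<strong>Important</strong>: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html" rel="nofollow">current release documentation</a>.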
</div>
<div id="content">
<div class="navheader">
<span class="prev">
<a href="ml-put-calendar.html">« Create calendar API</a>
</span>
<span class="next">
<a href="ml-put-filter.html">Create filter API »</a>
</span>
</div>
<div class="section xpack">
<div class="titlepage"><div><div>
<h2 class="title">
<a id="ml-put-datafeed"></a>Create datafeeds API<a class="xpack_tag" href="https://www.elastic.co/subscriptions"></a>
</h2>
</div></div></div>

<p>Instantiates a datafeed.</p>
<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="ml-put-datafeed-request"></a>Request
</h3>
</div></div></div>
<p><code class="literal">PUT _ml/datafeeds/&lt;feed_id&gt;</code></p>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="ml-put-datafeed-prereqs"></a>Prerequisites
</h3>
</div></div></div>
<div class="ulist itemizedlist">
<ul class="itemizedlist">
<li class="listitem">
You must create an anomaly detection job before you create a datafeed.
</li>
<li class="listitem">
If Elasticsearch security features are enabled, you must have <code class="literal">manage_ml</code> or <code class="literal">manage</code>
cluster privileges to use this API. See
<a class="xref" href="security-privileges.html" title="Security privileges">Security privileges</a>.
</li>
</ul>
</div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="ml-put-datafeed-desc"></a>Description
</h3>
</div></div></div>
<p><a href="https://www.elastic.co/guide/en/machine-learning/7.7/ml-dfeeds.html" class="ulink" target="_top">Datafeeds</a> retrieve data from Elasticsearch for analysis by
an anomaly detection job. You can associate only one datafeed with each anomaly detection job.</p>
<p>The datafeed contains a query that runs at a defined interval (<code class="literal">frequency</code>). If
you are concerned about delayed data, you can add a delay (<code class="literal">query_delay</code>) at
each interval. See <a href="https://www.elastic.co/guide/en/machine-learning/7.7/ml-delayed-data-detection.html" class="ulink" target="_top">Handling delayed data</a>.</p>
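<p>As a minimal sketch of these two settings, the following request creates a datafeed that runs its query every 150 seconds and stays 120 seconds behind real time. The datafeed ID, job ID, and index name are reused from the example at the end of this page; the specific values are illustrative assumptions, not recommendations.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT _ml/datafeeds/datafeed-total-requests
{
  "job_id": "total-requests",
  "indices": ["server-metrics"],
  "frequency": "150s",
  "query_delay": "120s"
}</pre>
</div>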
<div class="important admon">
<div class="icon"></div>
<div class="admon_content">
<div class="ulist itemizedlist">
<ul class="itemizedlist">
<li class="listitem">
You must use Kibana or this API to create a datafeed. Do not add a
datafeed directly to the <code class="literal">.ml-config</code> index using the Elasticsearch index API. If Elasticsearch
security features are enabled, do not give users <code class="literal">write</code> privileges on the
<code class="literal">.ml-config</code> index.
</li>
<li class="listitem">
When Elasticsearch security features are enabled, your datafeed remembers which roles
the user who created it had at the time of creation and runs the query using
those same roles.
</li>
</ul>
</div>
</div>
</div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="ml-put-datafeed-path-parms"></a>Path parameters
</h3>
</div></div></div>
<div class="variablelist">
<dl class="variablelist">
<dt>
<span class="term">
<code class="literal">&lt;feed_id&gt;</code>
</span>
</dt>
<dd>
(Required, string)
A string that uniquely identifies the datafeed. This identifier can
contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and
underscores. It must start and end with an alphanumeric character.
</dd>
</dl>
</div>
</div>

<div class="section child_attributes">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="ml-put-datafeed-request-body"></a>Request body
</h3>
</div></div></div>
<div class="variablelist">
<dl class="variablelist">
<dt>
<span class="term">
<code class="literal">aggregations</code>
</span>
</dt>
<dd>
(Optional, object)
If set, the datafeed performs aggregation searches. Support for aggregations is
limited and should be used only with low cardinality data. For more information,
see
<a href="https://www.elastic.co/guide/en/machine-learning/7.7/ml-configuring-aggregation.html" class="ulink" target="_top">Aggregating data for faster performance</a>.
</dd>
<dt>
<span class="term">
<code class="literal">chunking_config</code>
</span>
</dt>
<dd>
<p>
(Optional, object)
Datafeeds might be required to search over long time periods, such as several
months or years. Such a search is split into time chunks to keep the load on
Elasticsearch manageable. The chunking configuration controls how the size of these
time chunks is calculated and is an advanced configuration option.
</p>
<details open>
<summary class="title">Properties of <code class="literal">chunking_config</code></summary>
<div class="content">
<div class="variablelist">
<dl class="variablelist">
<dt>
<span class="term">
<code class="literal">mode</code>
</span>
</dt>
<dd>
<p>
(string)
There are three available modes:
</p>
<div class="ulist itemizedlist">
<ul class="itemizedlist">
<li class="listitem">
<code class="literal">auto</code>: The chunk size is dynamically calculated. This is the default and
recommended value.
</li>
<li class="listitem">
<code class="literal">manual</code>: Chunking is applied according to the specified <code class="literal">time_span</code>.
</li>
<li class="listitem">
<code class="literal">off</code>: No chunking is applied.
</li>
</ul>
</div>
</dd>
<dt>
<span class="term">
<code class="literal">time_span</code>
</span>
</dt>
<dd>
(<a class="xref" href="common-options.html#time-units" title="Time units">time units</a>)
The time span that each search queries. This setting is only applicable
when the mode is set to <code class="literal">manual</code>. For example: <code class="literal">3h</code>.
</dd>
</dl>
</div>
</div>
</details>
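<p>A minimal sketch of manual chunking, assuming the same hypothetical job and index names used in the example at the end of this page. Each search covers three hours of data:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT _ml/datafeeds/datafeed-total-requests
{
  "job_id": "total-requests",
  "indices": ["server-metrics"],
  "chunking_config": {
    "mode": "manual",
    "time_span": "3h"
  }
}</pre>
</div>
<p>Because <code class="literal">auto</code> is the default and recommended mode, a request like this is only worth considering when the automatically calculated chunk size is known to be unsuitable.</p>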
</dd>
<dt>
<span class="term">
<code class="literal">delayed_data_check_config</code>
</span>
</dt>
<dd>
<p>
(Optional, object)
Specifies whether the datafeed checks for missing data and the size of the
window. For example: <code class="literal">{"enabled": true, "check_window": "1h"}</code>.
</p>
<p>The datafeed can optionally search over indices that have already been read in
an effort to determine whether any data has subsequently been added to the
index. If missing data is found, it is a good indication that the <code class="literal">query_delay</code>
option is set too low and the data is being indexed after the datafeed has passed
that moment in time. See
<a href="https://www.elastic.co/guide/en/machine-learning/7.7/ml-delayed-data-detection.html" class="ulink" target="_top">Working with delayed data</a>.</p>
<p>This check runs only on real-time datafeeds.</p>
<details open>
<summary class="title">Properties of <code class="literal">delayed_data_check_config</code></summary>
<div class="content">
<div class="variablelist">
<dl class="variablelist">
<dt>
<span class="term">
<code class="literal">check_window</code>
</span>
</dt>
<dd>
(<a class="xref" href="common-options.html#time-units" title="Time units">time units</a>) The window of time that is searched for late data.
This window of time ends with the latest finalized bucket. It defaults to
<code class="literal">null</code>, which causes an appropriate <code class="literal">check_window</code> to be calculated when the
real-time datafeed runs. Specifically, the default <code class="literal">check_window</code> is the
larger of <code class="literal">2h</code> and <code class="literal">8 * bucket_span</code>.
</dd>
<dt>
<span class="term">
<code class="literal">enabled</code>
</span>
</dt>
<dd>
(boolean) Specifies whether the datafeed periodically checks for delayed data.
Defaults to <code class="literal">true</code>.
</dd>
</dl>
</div>
</div>
</details>
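<p>To illustrate the default calculation: with a bucket span of <code class="literal">15m</code>, <code class="literal">8 * bucket_span</code> is <code class="literal">2h</code>, so the default window is <code class="literal">2h</code>; with a bucket span of <code class="literal">1h</code>, it is <code class="literal">8h</code>. The following sketch sets the window explicitly instead of relying on the default. The job and index names are the hypothetical ones from the example at the end of this page, and the request assumes the job's bucket span is shorter than the chosen window.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT _ml/datafeeds/datafeed-total-requests
{
  "job_id": "total-requests",
  "indices": ["server-metrics"],
  "delayed_data_check_config": {
    "enabled": true,
    "check_window": "2h"
  }
}</pre>
</div>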
</dd>
<dt>
<span class="term">
<code class="literal">frequency</code>
</span>
</dt>
<dd>
(Optional, <a class="xref" href="common-options.html#time-units" title="Time units">time units</a>)
The interval at which scheduled queries are made while the datafeed runs in real
time. The default value is either the bucket span for short bucket spans, or,
for longer bucket spans, a sensible fraction of the bucket span. For example:
<code class="literal">150s</code>. When <code class="literal">frequency</code> is shorter than the bucket span, interim results for
the last (partial) bucket are written and then eventually overwritten by the full
bucket results. If the datafeed uses aggregations, this value must be divisible
by the interval of the date histogram aggregation.
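<p>A sketch of how <code class="literal">frequency</code> relates to an aggregation interval. It assumes a hypothetical time field named <code class="literal">@timestamp</code> and a hypothetical numeric field named <code class="literal">total</code>. The date histogram interval is 300 seconds and the frequency is 600 seconds, a whole multiple of that interval (600 / 300 = 2). The aggregation guide linked under <code class="literal">aggregations</code> describes further requirements on the job configuration that are not shown here.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT _ml/datafeeds/datafeed-total-requests
{
  "job_id": "total-requests",
  "indices": ["server-metrics"],
  "frequency": "600s",
  "aggregations": {
    "buckets": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "300s"
      },
      "aggregations": {
        "@timestamp": {
          "max": { "field": "@timestamp" }
        },
        "total": {
          "sum": { "field": "total" }
        }
      }
    }
  }
}</pre>
</div>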
</dd>
<dt>
<span class="term">
<code class="literal">indices</code>
</span>
</dt>
<dd>
<p>
(Required, array)
An array of index names. Wildcards are supported. For example:
<code class="literal">["it_ops_metrics", "server*"]</code>.
</p>
<div class="note admon">
<div class="icon"></div>
<div class="admon_content">
<p>If any indices are in remote clusters then <code class="literal">node.remote_cluster_client</code>
must not be set to <code class="literal">false</code> on any machine learning nodes.</p>
</div>
</div>
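<p>Indices in remote clusters are referenced with the cross-cluster search syntax <code class="literal">&lt;cluster_alias&gt;:&lt;index&gt;</code>. A minimal sketch, assuming a remote cluster has been registered under the hypothetical alias <code class="literal">cluster_two</code>:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT _ml/datafeeds/datafeed-total-requests
{
  "job_id": "total-requests",
  "indices": ["server-metrics", "cluster_two:server-metrics"]
}</pre>
</div>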
</dd>
<dt>
<span class="term">
<code class="literal">job_id</code>
</span>
</dt>
<dd>
(Required, string)
Identifier for the anomaly detection job.
</dd>
<dt>
<span class="term">
<code class="literal">max_empty_searches</code>
</span>
</dt>
<dd>
(Optional, integer)
If a real-time datafeed has never seen any data (including during any initial
training period) then it will automatically stop itself and close its associated
job after this many real-time searches that return no documents. In other words,
it will stop after <code class="literal">frequency</code> times <code class="literal">max_empty_searches</code> of real-time
operation. If not set then a datafeed with no end time that sees no data will
remain started until it is explicitly stopped. By default this setting is not
set.
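<p>As a worked example under assumed values: with a <code class="literal">frequency</code> of <code class="literal">150s</code> and <code class="literal">max_empty_searches</code> set to <code class="literal">10</code>, a real-time datafeed that has never seen data stops after roughly 150 s × 10 = 1500 s, or 25 minutes, of real-time operation. The names in this sketch are the hypothetical ones from the example at the end of this page.</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT _ml/datafeeds/datafeed-total-requests
{
  "job_id": "total-requests",
  "indices": ["server-metrics"],
  "frequency": "150s",
  "max_empty_searches": 10
}</pre>
</div>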
</dd>
<dt>
<span class="term">
<code class="literal">query</code>
</span>
</dt>
<dd>
(Optional, object)
The Elasticsearch query domain-specific language (DSL). This value corresponds to the
query object in an Elasticsearch search POST body. All the options that are supported by
Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this
property has the following value: <code class="literal">{"match_all": {"boost": 1}}</code>.
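<p>A minimal sketch of a non-default query, assuming the documents contain a hypothetical keyword field named <code class="literal">host</code>. Only documents that match the filter are passed to the anomaly detection job:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT _ml/datafeeds/datafeed-total-requests
{
  "job_id": "total-requests",
  "indices": ["server-metrics"],
  "query": {
    "bool": {
      "filter": [
        { "term": { "host": "web-01" } }
      ]
    }
  }
}</pre>
</div>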
</dd>
<dt>
<span class="term">
<code class="literal">query_delay</code>
</span>
</dt>
<dd>
(Optional, <a class="xref" href="common-options.html#time-units" title="Time units">time units</a>)
The number of seconds behind real time that data is queried. For example, if
data from 10:04 a.m. might not be searchable in Elasticsearch until 10:06 a.m., set this
property to 120 seconds. The default value is randomly selected between <code class="literal">60s</code>
and <code class="literal">120s</code>. This randomness improves the query performance when there are
multiple jobs running on the same node. For more information, see
<a href="https://www.elastic.co/guide/en/machine-learning/7.7/ml-delayed-data-detection.html" class="ulink" target="_top">Handling delayed data</a>.
</dd>
<dt>
<span class="term">
<code class="literal">script_fields</code>
</span>
</dt>
<dd>
(Optional, object)
Specifies scripts that evaluate custom expressions and return script fields to
the datafeed. The detector configuration objects in a job can contain functions
that use these script fields. For more information, see
<a href="https://www.elastic.co/guide/en/machine-learning/7.7/ml-configuring-transform.html" class="ulink" target="_top">Transforming data with script fields</a>
and <a class="xref" href="search-request-body.html#request-body-search-script-fields" title="Script Fields">Script fields</a>.
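<p>A minimal sketch of a script field, assuming the documents contain two hypothetical numeric fields, <code class="literal">error_count</code> and <code class="literal">aborted_count</code>. The script adds them and returns the sum as <code class="literal">total_error_count</code>, which a detector in the job could then reference:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT _ml/datafeeds/datafeed-total-requests
{
  "job_id": "total-requests",
  "indices": ["server-metrics"],
  "script_fields": {
    "total_error_count": {
      "script": {
        "lang": "painless",
        "source": "doc['error_count'].value + doc['aborted_count'].value"
      }
    }
  }
}</pre>
</div>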
</dd>
<dt>
<span class="term">
<code class="literal">scroll_size</code>
</span>
</dt>
<dd>
(Optional, unsigned integer)
The <code class="literal">size</code> parameter that is used in Elasticsearch searches. The default value is <code class="literal">1000</code>.
</dd>
<dt>
<span class="term">
<code class="literal">indices_options</code>
</span>
</dt>
<dd>
<p>
(Optional, object)
Specifies index expansion options that are used during search.
</p>
<p>For example:</p>
<pre class="screen">{
   "expand_wildcards": ["all"],
   "ignore_unavailable": true,
   "allow_no_indices": false,
   "ignore_throttled": true
}</pre>
<p>For more information about these options, see <a class="xref" href="multi-index.html" title="Multiple indices">Multiple indices</a>.</p>
</dd>
</dl>
</div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="ml-put-datafeed-example"></a>Examples
</h3>
</div></div></div>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT _ml/datafeeds/datafeed-total-requests
{
  "job_id": "total-requests",
  "indices": ["server-metrics"]
}</pre>
</div>
<p>When the datafeed is created, you receive the following results:</p>
<div class="pre_wrapper lang-console-result">
<pre class="programlisting prettyprint lang-console-result">{
  "datafeed_id": "datafeed-total-requests",
  "job_id": "total-requests",
  "query_delay": "83474ms",
  "indices": [
    "server-metrics"
  ],
  "query": {
    "match_all": {
      "boost": 1.0
    }
  },
  "scroll_size": 1000,
  "chunking_config": {
    "mode": "auto"
  }
}</pre>
</div>
</div>

</div>
</div>

                  <!-- end body -->
                        </div>
                        <div class="col-xs-12 col-sm-4 col-md-4" id="right_col">
                        
                        </div>
                    </div>
                </div>
            </section>
        </div>
    </section>
</div>
<script src="../static/cn.js"></script>
</body>
</html>